39 research outputs found

    A Novel Transformer Network with Shifted Window Cross-Attention for Spatiotemporal Weather Forecasting

    Earth Observatory is a growing research area that can capitalize on the power of AI for short-term forecasting, i.e. nowcasting. In this work, we tackle the challenge of weather forecasting using a video transformer network. Vision transformer architectures have been explored in various applications, with the major constraints being the computational complexity of attention and data-hungry training. To address these issues, we propose the use of a Video Swin-Transformer, coupled with a dedicated augmentation scheme. Moreover, we employ gradual spatial reduction on the encoder side and cross-attention on the decoder. The proposed approach is tested on the Weather4Cast2021 weather forecasting challenge data, which requires the prediction of future frames up to 8 hours ahead (4 per hour) from an hourly weather product sequence. The dataset was normalized to 0-1 to facilitate using the evaluation metrics across different datasets. The model achieves an MSE score of 0.4750 when provided with training data, and 0.4420 during transfer learning without using training data. Comment: 16 pages, 7 figures, 7 tables

    Computation of Heterogeneous Object Co-embeddings from Relational Measurements

    Dimensionality reduction and data embedding methods generate low-dimensional representations of a single type of homogeneous data objects. In this work, we examine the problem of generating co-embeddings or pattern representations from two different types of objects within a joint common space of controlled dimensionality, where the only available information is assumed to be a set of pairwise relations or similarities between instances of the two groups. We propose a new method that models the embedding of each object type symmetrically to the other type, subject to flexible scale constraints and weighting parameters. The embedding generation relies on an efficient optimization dispatched using matrix decomposition, which is also extended to support multidimensional co-embeddings. We also propose a scheme for heuristically reducing the parameters of the model, and a simple way of measuring the conformity between the original object relations and the ones re-estimated from the co-embeddings, in order to achieve model selection by identifying the optimal model parameters with a simple search procedure. The capabilities of the proposed method are demonstrated with multiple synthetic and real-world datasets from the text mining domain. The experimental results and comparative analyses indicate that the proposed algorithm outperforms existing methods for co-embedding generation.

    Physics-Driven ML-Based Modelling for Correcting Inverse Estimation

    When deploying machine learning estimators in science and engineering (SAE) domains, it is critical to avoid failed estimations that can have disastrous consequences, e.g., in aero engine design. This work focuses on detecting and correcting failed state estimations before adopting them in SAE inverse problems, by utilizing simulations and performance metrics guided by physical laws. We suggest flagging a machine learning estimation when its physical model error exceeds a feasible threshold, and propose a novel approach, GEESE, to correct it through optimization, aiming at delivering both low error and high efficiency. The key designs of GEESE include (1) a hybrid surrogate error model to provide fast error estimations to reduce simulation cost and to enable gradient-based backpropagation of error feedback, and (2) two generative models to approximate the probability distributions of the candidate states for simulating the exploitation and exploration behaviours. All three models are constructed as neural networks. GEESE is tested on three real-world SAE inverse problems and compared to a number of state-of-the-art optimization/search approaches. Results show that it fails the least number of times in terms of finding a feasible state correction, and requires physical evaluations less frequently in general. Comment: 19 pages; the paper is accepted by NeurIPS 2023 as a spotlight

    Brain Tumor Segmentation in Fluid-Attenuated Inversion Recovery Brain MRI using Residual Network Deep Learning Architectures

    Early and accurate detection of brain tumors is very important to save the patient's life. Brain tumors are generally diagnosed manually by a radiologist analyzing the patient’s brain MRI scans, which is a time-consuming process. This motivated our study of how to automate the diagnosis to increase its speed and accuracy. In this study, we investigate the use of the Residual Network deep learning architecture to diagnose and segment brain tumors. We propose a two-step method involving a tumor detection stage, using the ResNet50 architecture, and a tumor area segmentation stage, using the ResU-Net architecture. We adopt transfer learning on pre-trained models to help get the best performance out of the approach, as well as data augmentation to lessen the effect of data population imbalance and hyperparameter optimization to get the best set of training parameter values. Using a publicly available dataset as a testbed, we show that our approach achieves a Dice coefficient of 84.3%, outperforming the U-Net-based state of the art by 2%.
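
    The Dice coefficient reported above measures the overlap between a predicted and a reference segmentation mask. The following is a minimal sketch of the standard definition, not the authors' evaluation code; the function name `dice_coefficient`, the flat 0/1 mask representation, and the empty-mask convention are assumptions made for illustration.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2|A n B| / (|A| + |B|) for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as tumor in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total number of tumor pixels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical toy masks: 2 overlapping pixels, 3 positives in each mask.
pred_mask = [0, 1, 1, 0, 1, 0]
true_mask = [0, 1, 0, 0, 1, 1]
print(dice_coefficient(pred_mask, true_mask))
```

    A score of 1.0 means the two masks coincide exactly, and 0.0 means they share no tumor pixels.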

    Data Augmentation Using Generative Adversarial Networks to Reduce Data Imbalance with Application in Car Damage Detection

    Automatic car damage detection and assessment are very useful in alleviating the burden of manual inspection associated with car insurance claims. This will help filter out any frivolous claims that can take up time and money to process. This problem falls into the image classification category and there has been significant progress in this field using deep learning. However, deep learning models require a large number of images for training, and this is often hampered by the lack of datasets of suitable images. This research investigates data augmentation techniques using Generative Adversarial Networks to increase the size and improve the class balance of a dataset used for training deep learning models for car damage detection and classification. We compare the performance of such an approach with one that uses a conventional data augmentation technique and with another that does not use any data augmentation. Our experiment shows that this approach yields a significant improvement over training without data augmentation and a slight improvement over conventional data augmentation.

    Semantic Segmentation and Depth Estimation of Urban Road Scene Images Using Multi-Task Networks

    In autonomous driving, environment perception is an important step in understanding the driving scene. Objects in images captured through a vehicle camera can be detected and classified using semantic segmentation and depth estimation methods. These two tasks are closely related, and this association helps in building a multi-task neural network where a single network is used to generate both views from a given monocular image. This approach gives the flexibility to include multiple related tasks in a single network. It helps reduce the number of independent networks and improve the performance of all related tasks. The main aim of the research presented in this paper is to build a multi-task deep learning network for simultaneous semantic segmentation and depth estimation from monocular images. Two decoder-focused U-Net-based multi-task networks are considered, using a pre-trained ResNet-50 and DenseNet-121 as the shared encoder, with task-specific decoder networks with attention mechanisms. We also employ multi-task optimization strategies such as equal weighting and dynamic weight averaging during the training of the models. The corresponding models’ performance is evaluated using mean IoU for semantic segmentation and Root Mean Square Error for depth estimation. From our experiments, we found that the performance of these multi-task networks is on par with the corresponding single-task networks.
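
    The two evaluation metrics named above can be sketched as follows. This is an illustrative sketch of the standard definitions under stated assumptions, not the authors' evaluation code: the function names, the flat per-pixel sequences, and the choice to skip classes absent from both prediction and ground truth are all assumptions.

```python
import math
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union across classes for flat per-pixel label sequences."""
    if len(pred) != len(target):
        raise ValueError("label sequences must have the same length")
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        # Assumed convention: classes absent from both sequences are skipped.
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def rmse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Root mean square error between predicted and reference depth values."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("depth sequences must be non-empty and of equal length")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


# Hypothetical toy data: six pixels with three classes, and three depth values.
pred_labels = [0, 0, 1, 1, 2, 2]
true_labels = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred_labels, true_labels, num_classes=3))
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

    Higher mean IoU is better for segmentation, while lower RMSE is better for depth estimation, so the two numbers cannot be compared directly.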

    Application of Convolutional Neural Networks for Automated Ulcer Detection in Wireless Capsule Endoscopy Images.

    Detection of abnormalities in wireless capsule endoscopy (WCE) images is a challenging task. Typically, these images suffer from low contrast, complex backgrounds, and variations in lesion shape and color, which affect the accuracy of their segmentation and subsequent classification. This research proposes an automated system for detection and classification of ulcers in WCE images, based on state-of-the-art deep learning networks. Deep learning techniques, and in particular convolutional neural networks (CNNs), have recently become popular in the analysis and recognition of medical images. The medical image datasets used in this study were obtained from WCE video frames. In this work, two milestone CNN architectures, namely AlexNet and GoogLeNet, are extensively evaluated on classifying objects as ulcer or non-ulcer. Furthermore, we examine and analyze the images identified as containing ulcer objects to evaluate the efficiency of the utilized CNNs. Extensive experiments show that CNNs deliver superior performance, surpassing traditional machine learning methods by large margins, which supports their effectiveness as automated diagnosis tools.